x86/mm: avoid inadvertently degrading a TLB flush to local only
authorDavid Vrabel <dvrabel@amazon.co.uk>
Tue, 7 Jun 2022 11:59:31 +0000 (13:59 +0200)
committerJan Beulich <jbeulich@suse.com>
Tue, 7 Jun 2022 11:59:31 +0000 (13:59 +0200)
commit7c003ab4a398ff4ddd54d15d4158cffb463134cc
tree406a314de03d03d4c49e275152f02fbb8e4afb22
parentb378ee56c7e0bb5eeb35dcc55b3d29e5f50eb566
x86/mm: avoid inadvertently degrading a TLB flush to local only

If the direct map is incorrectly modified with interrupts disabled,
the required TLB flushes are degraded to flushing the local CPU only.

This could lead to very hard to diagnose problems as different CPUs will
end up with different views of memory. Although, no such issues have yet
been identified.

Change the check in the flush_area() macro to look at system_state
instead. This defers the switch from local to all later in the boot
(see xen/arch/x86/setup.c:__start_xen()). This is fine because
additional PCPUs are not brought up until after the system state is
SYS_STATE_smp_boot.

Signed-off-by: David Vrabel <dvrabel@amazon.co.uk>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
x86/flushtlb: remove flush_area check on system state

Booting with Shadow Stacks leads to the following assert on a debug
hypervisor:

Assertion 'local_irq_is_enabled()' failed at arch/x86/smp.c:265
----[ Xen-4.17.0-10.24-d  x86_64  debug=y  Not tainted ]----
CPU:    0
RIP:    e008:[<ffff82d040345300>] flush_area_mask+0x40/0x13e
[...]
Xen call trace:
   [<ffff82d040345300>] R flush_area_mask+0x40/0x13e
   [<ffff82d040338a40>] F modify_xen_mappings+0xc5/0x958
   [<ffff82d0404474f9>] F arch/x86/alternative.c#_alternative_instructions+0xb7/0xb9
   [<ffff82d0404476cc>] F alternative_branches+0xf/0x12
   [<ffff82d04044e37d>] F __start_xen+0x1ef4/0x2776
   [<ffff82d040203344>] F __high_start+0x94/0xa0

This is due to SYS_STATE_smp_boot being set before calling
alternative_branches(), and the flush in modify_xen_mappings() then
using flush_area_all() with interrupts disabled.  Note that
alternative_branches() is called before APs are started, so the flush
must be a local one (and indeed the cpumask passed to
flush_area_mask() just contains one CPU).

Take the opportunity to simplify a bit the logic and make flush_area()
an alias of flush_area_all() in mm.c, taking into account that
cpu_online_map just contains the BSP before APs are started.  This
requires widening the assert in flush_area_mask() to allow being
called with interrupts disabled as long as it's strictly a local only
flush.

The overall result is that a conditional can be removed from
flush_area().

While there also introduce an ASSERT to check that a vCPU state flush
is not issued for the local CPU only.

Fixes: 78e072bc37 ('x86/mm: avoid inadvertently degrading a TLB flush to local only')
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 78e072bc375043e81691a59454e09f0b38241ddd
master date: 2022-04-20 10:55:01 +0200
master commit: 9f735ee4903f1b9f1966bb4ba5b5616b03ae08b5
master date: 2022-05-25 11:09:46 +0200
xen/arch/x86/mm.c
xen/arch/x86/smp.c